AITopics | Harrisonburg

Harrisonburg

Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models

Bazaga, Adrián, Blloshmi, Rexhina, Byrne, Bill, de Gispert, Adrià

arXiv.org Artificial Intelligence, Jun-2-2025

Large Language Models (LLMs) have emerged as powerful tools for generating coherent text, understanding context, and performing reasoning tasks. However, they struggle with temporal reasoning, which requires processing time-related information such as event sequencing, durations, and inter-temporal relationships. These capabilities are critical for applications including question answering, scheduling, and historical analysis. In this paper, we introduce TISER, a novel framework that enhances the temporal reasoning abilities of LLMs through a multi-stage process that combines timeline construction with iterative self-reflection. Our approach leverages test-time scaling to extend the length of reasoning traces, enabling models to capture complex temporal dependencies more effectively. This strategy not only boosts reasoning accuracy but also improves the traceability of the inference process. Experimental results demonstrate state-of-the-art performance across multiple benchmarks, including out-of-distribution test sets, and reveal that TISER enables smaller open-source models to surpass larger closed-weight models on challenging temporal reasoning tasks.

large language model, machine learning, temporal reasoning, (17 more...)

2504.05258

Country:

North America > United States > Kansas (0.05)
North America > United States > California > San Francisco County > San Francisco (0.05)
North America > United States > Connecticut > Hartford County > Bristol (0.04)
(12 more...)

Genre: Research Report > New Finding (0.66)

Industry: Leisure & Entertainment > Sports > Soccer (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Temporal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Unveiling the Mathematical Reasoning in DeepSeek Models: A Comparative Study of Large Language Models

Jahin, Afrar, Zidan, Arif Hassan, Bao, Yu, Liang, Shizhe, Liu, Tianming, Zhang, Wei

arXiv.org Artificial Intelligence, Mar-13-2025

With the rapid evolution of Artificial Intelligence (AI), Large Language Models (LLMs) have reshaped the frontiers of various fields, spanning healthcare, public health, engineering, science, agriculture, education, arts, humanities, and mathematical reasoning. Among these advancements, DeepSeek models have emerged as noteworthy contenders, demonstrating promising capabilities that set them apart from their peers. While previous studies have conducted comparative analyses of LLMs, few have delivered a comprehensive evaluation of mathematical reasoning across a broad spectrum of LLMs. In this work, we aim to bridge this gap by conducting an in-depth comparative study, focusing on the strengths and limitations of DeepSeek models in relation to their leading counterparts. In particular, our study systematically evaluates the mathematical reasoning performance of two DeepSeek models alongside five prominent LLMs across three independent benchmark datasets. The findings reveal several key insights: (1) DeepSeek-R1 consistently achieved the highest accuracy on two of the three datasets, demonstrating strong mathematical reasoning capabilities. (2) The distilled variant of LLMs significantly underperformed compared to its peers, highlighting potential drawbacks in using distillation techniques. (3) In terms of response time, Gemini 2.0 Flash demonstrated the fastest processing speed, outperforming other models in efficiency, which is a crucial factor for real-time applications. Beyond these quantitative assessments, we delve into how architecture, training, and optimization impact LLMs' mathematical reasoning. Moreover, our study goes beyond mere performance comparison by identifying key areas for future advancements in LLM-driven mathematical reasoning. This research enhances our understanding of LLMs' mathematical reasoning and lays the groundwork for future advancements.

arxiv preprint arxiv, mathematical reasoning, reasoning, (15 more...)

2503.10573

Country:

North America > United States > Georgia > Clarke County > Athens (0.14)
North America > United States > Georgia > Richmond County > Augusta (0.04)
North America > United States > Virginia > Harrisonburg (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre:

Overview (1.00)
Research Report > New Finding (0.93)

Industry:

Health & Medicine (1.00)
Education > Educational Setting (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Beyond Reweighting: On the Predictive Role of Covariate Shift in Effect Generalization

Jin, Ying, Egami, Naoki, Rothenhäusler, Dominik

arXiv.org Artificial Intelligence, Dec-11-2024

Many existing approaches to generalizing statistical inference amidst distribution shift operate under the covariate shift assumption, which posits that the conditional distribution of unobserved variables given observable ones is invariant across populations. However, recent empirical investigations have demonstrated that adjusting for shift in observed variables (covariate shift) is often insufficient for generalization. In other words, covariate shift does not typically "explain away" the distribution shift between settings. As such, addressing the unknown yet non-negligible shift in the unobserved variables given observed ones (conditional shift) is crucial for generalizable inference. In this paper, we present a series of empirical evidence from two large-scale multi-site replication studies to support a new role of covariate shift in "predicting" the strength of the unknown conditional shift. Analyzing 680 studies across 65 sites, we find that even though the conditional shift is non-negligible, its strength can often be bounded by that of the observable covariate shift. However, this pattern only emerges when the two sources of shifts are quantified by our proposed standardized, "pivotal" measures. We then interpret this phenomenon by connecting it to similar patterns that can be theoretically derived from a random distribution shift model. Finally, we demonstrate that exploiting the predictive role of covariate shift leads to reliable and efficient uncertainty quantification for target estimates in generalization tasks with partially observed data. Overall, our empirical and theoretical analyses suggest a new way to approach the problem of distributional shift, generalizability, and external validity.

artificial intelligence, machine learning, prediction interval, (16 more...)

2412.08869

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
North America > United States > Florida > Alachua County > Gainesville (0.14)
(32 more...)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)

Industry:

Government (0.92)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.92)

Evaluation of OpenAI o1: Opportunities and Challenges of AGI

Zhong, Tianyang, Liu, Zhengliang, Pan, Yi, Zhang, Yutong, Zhou, Yifan, Liang, Shizhe, Wu, Zihao, Lyu, Yanjun, Shu, Peng, Yu, Xiaowei, Cao, Chao, Jiang, Hanqi, Chen, Hanxu, Li, Yiwei, Chen, Junhao, Hu, Huawen, Liu, Yihen, Zhao, Huaqin, Xu, Shaochen, Dai, Haixing, Zhao, Lin, Zhang, Ruidong, Zhao, Wei, Yang, Zhenyuan, Chen, Jingyuan, Wang, Peilong, Ruan, Wei, Wang, Hui, Zhao, Huan, Zhang, Jing, Ren, Yiming, Qin, Shihuan, Chen, Tong, Li, Jiaxi, Zidan, Arif Hassan, Jahin, Afrar, Chen, Minheng, Xia, Sichen, Holmes, Jason, Zhuang, Yan, Wang, Jiaqi, Xu, Bochen, Xia, Weiran, Yu, Jichao, Tang, Kaibo, Yang, Yaxuan, Sun, Bolun, Yang, Tao, Lu, Guoyu, Wang, Xianqiao, Chai, Lilong, Li, He, Lu, Jin, Sun, Lichao, Zhang, Xin, Ge, Bao, Hu, Xintao, Zhang, Lian, Zhou, Hua, Zhang, Lu, Zhang, Shu, Liu, Ninghao, Jiang, Bei, Kong, Linglong, Xiang, Zhen, Ren, Yudan, Liu, Jun, Jiang, Xi, Bao, Yu, Zhang, Wei, Li, Xiang, Li, Gang, Liu, Wei, Shen, Dinggang, Sikora, Andrea, Zhai, Xiaoming, Zhu, Dajiang, Liu, Tianming

arXiv.org Artificial Intelligence, Sep-27-2024

This comprehensive study evaluates the performance of OpenAI's o1-preview large language model across a diverse array of complex reasoning tasks, spanning multiple domains, including computer science, mathematics, natural sciences, medicine, linguistics, and social sciences. Through rigorous testing, o1-preview demonstrated remarkable capabilities, often achieving human-level or superior performance in areas ranging from coding challenges to scientific reasoning and from language processing to creative problem-solving. Key findings include:

- 83.3% success rate in solving complex competitive programming problems, surpassing many human experts.
- Superior ability in generating coherent and accurate radiology reports, outperforming other evaluated models.
- 100% accuracy in high school-level mathematical reasoning tasks, providing detailed step-by-step solutions.
- Advanced natural language inference capabilities across general and specialized domains like medicine.
- Impressive performance in chip design tasks, outperforming specialized models in areas such as EDA script generation and bug analysis.
- Remarkable proficiency in anthropology and geology, demonstrating deep understanding and reasoning in these specialized fields.
- Strong capabilities in quantitative investing. O1 has comprehensive financial knowledge and statistical modeling skills.
- Effective performance in social media analysis, including sentiment analysis and emotion recognition.

The model excelled particularly in tasks requiring intricate reasoning and knowledge integration across various fields. While some limitations were observed, including occasional errors on simpler problems and challenges with certain highly specialized concepts, the overall results indicate significant progress towards artificial general intelligence.

chip design-engineering assistant chatbot, educational measurement and psychometric, table-to-text generation, (15 more...)

2409.18486

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.27)
North America > United States > Georgia > Clarke County > Athens (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.13)
(31 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
(2 more...)

Industry:

Leisure & Entertainment (1.00)
Information Technology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
(12 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.70)

Autonomous Hiking Trail Navigation via Semantic Segmentation and Geometric Analysis

Reed, Camndon, Tatsch, Christopher, Gross, Jason N., Gu, Yu

arXiv.org Artificial Intelligence, Sep-23-2024

Natural environments pose significant challenges for autonomous robot navigation, particularly due to their unstructured and ever-changing nature. Hiking trails, with their dynamic conditions influenced by weather, vegetation, and human traffic, represent one such challenge. This work introduces a novel approach to autonomous hiking trail navigation that balances trail adherence with the flexibility to adapt to off-trail routes when necessary. The solution is a Traversability Analysis module that integrates semantic data from camera images with geometric information from LiDAR to create a comprehensive understanding of the surrounding terrain. A planner uses this traversability map to navigate safely, adhering to trails while allowing off-trail movement when necessary to avoid on-trail hazards or for safe off-trail shortcuts. The method is evaluated through simulation to determine the balance between semantic and geometric information in traversability estimation. These simulations tested various weights to assess their impact on navigation performance across different trail scenarios. Weights were then validated through field tests at the West Virginia University Core Arboretum, demonstrating the method's effectiveness in a real-world environment.

information, navigation, robot, (14 more...)

2409.15671

Country:

North America > United States > West Virginia (0.25)
South America > Brazil (0.04)
North America > United States > Virginia > Harrisonburg (0.04)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)

Genre: Research Report (1.00)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Deep Neural Network Identification of Limnonectes Species and New Class Detection Using Image Data

Xu, Li, Hong, Yili, Smith, Eric P., McLeod, David S., Deng, Xinwei, Freeman, Laura J.

arXiv.org Machine Learning, Nov-14-2023

As is true of many complex tasks, the work of discovering, describing, and understanding the diversity of life on Earth (viz., biological systematics and taxonomy) requires many tools. Some of this work can be accomplished as it has been done in the past, but some aspects present us with challenges which traditional knowledge and tools cannot adequately resolve. One such challenge is presented by species complexes in which the morphological similarities among the group members make it difficult to reliably identify known species and detect new ones. We address this challenge by developing new tools using the principles of machine learning to resolve two specific questions related to species complexes. The first question is formulated as a classification problem in statistics and machine learning and the second question is an out-of-distribution (OOD) detection problem. We apply these tools to a species complex comprising Southeast Asian stream frogs (Limnonectes kuhlii complex) and employ a morphological character (hind limb skin texture) traditionally treated qualitatively in a quantitative and objective manner. We demonstrate that deep neural networks can successfully automate the classification of an image into a known species group for which it has been trained. We further demonstrate that the algorithm can successfully classify an image into a new class if the image does not belong to the existing classes. Additionally, we use the larger MNIST dataset to test the performance of our OOD detection algorithm. We finish our paper with some concluding remarks regarding the application of these methods to species complexes and our efforts to document true biodiversity. This paper has online supplementary materials.

artificial intelligence, machine learning, new species, (18 more...)

2311.08661

Country:

Asia > Thailand (0.04)
Asia > Vietnam (0.04)
Asia > Southeast Asia (0.04)
(12 more...)

Genre: Research Report > New Finding (0.94)

Industry:

Health & Medicine (0.94)
Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Modeling Supply and Demand in Public Transportation Systems

Bihler, Miranda, Nelson, Hala, Okey, Erin, Rivas, Noe Reyes, Webb, John, White, Anna

arXiv.org Machine Learning, Oct-20-2023

We propose two neural network based and data-driven supply and demand models to analyze the efficiency, identify service gaps, and determine the significant predictors of demand, in the bus system for the Department of Public Transportation (HDPT) in Harrisonburg City, Virginia, which is the home to James Madison University (JMU). The supply and demand models, one temporal and one spatial, take many variables into account, including the demographic data surrounding the bus stops, the metrics that the HDPT reports to the federal government, and the drastic change in population between when JMU is in and out of session. These direct and data-driven models to quantify supply and demand and identify service gaps can generalize to other cities' bus systems. Keywords: transportation systems, bus systems, public transportation, direct ridership models, data driven models, mathematical modeling, neural networks, machine learning, supply models, demand models, service gaps, social vulnerability, public transportation access, GIS data, data science, data quality.

artificial intelligence, machine learning, ridership, (16 more...)

2309.06299

Country:

North America > Canada > Ontario > Hamilton (0.14)
North America > United States > Virginia > Harrisonburg (0.04)
Asia > South Korea > Seoul > Seoul (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Virginia 'shopping cart killer' case flags dating app dangers: They're a 'toy store' for murderers

FOX News, Jan-9-2022, 16:01:48 GMT

Crime Stoppers of Houston's Andy Kahan and FOP national vice president Joe Gamaldi react to the nation's growing crime crisis on 'Justice w/ Judge Jeanine.' A potential fifth victim has been identified in the "shopping cart killer" case, involving an alleged serial killer in Northern Virginia, that has crime experts warning of the dangers of online dating. Officers believe suspect Anthony Robinson made contact with the victims via dating websites, which Crime Stoppers of Houston's Andy Kahan described on "Justice w/ Judge Jeanine" as "toy stores" for murderers. "The dark side of online dating apps are luring in millions of women to, perhaps… mortal danger," he said. "There are no background checks; we all know sex offenders troll it. You're essentially playing Russian roulette with your life when you divulge personal information and continue to go out and meet people that you do not know."

artificial intelligence, killer, social media, (15 more...)

Country:

North America > United States > Virginia > Harrisonburg (0.06)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.06)

Industry: Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)

Improving mathematical questioning in teacher training

Datta, Debajyoti, Phillips, Maria, Bywater, James P, Chiu, Jennifer, Watson, Ginger S., Barnes, Laura E., Brown, Donald E

arXiv.org Artificial Intelligence, Dec-6-2021

High-fidelity, AI-based simulated classroom systems enable teachers to rehearse effective teaching strategies. However, dialogue-oriented open-ended conversations such as teaching a student about scale factors can be difficult to model. This paper builds a text-based interactive conversational agent to help teachers practice mathematical questioning skills based on the well-known Instructional Quality Assessment. We take a human-centered approach to designing our system, relying on advances in deep learning, uncertainty quantification, and natural language processing while acknowledging the limitations of conversational agents for specific pedagogical needs. Using experts' input directly during the simulation, we demonstrate how conversation success rate and high user satisfaction can be achieved.

arxiv preprint arxiv, conversational agent, dialogue system, (14 more...)

2112.01537

Country:

North America > United States > Virginia > Albemarle County > Charlottesville (0.16)
North America > United States > Virginia > Harrisonburg (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report (0.51)
Questionnaire & Opinion Survey (0.48)

Industry: Education > Teacher Education (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Introduction To Deep Learning Coursera Github Hse

#artificialintelligence, Oct-10-2019, 00:39:15 GMT

Courses The major educational initiative of the JHUDSL is to create open-source online courses delivered through a range of platforms including YouTube, GitHub, Leanpub, and Coursera. Welcome to the "Introduction to Deep Learning" course! In the first week you'll learn about linear models and stochastic optimization methods. Please note that this is an advanced course and we assume basic knowledge of machine learning. I am currently working as a data science researcher and trainee at Jheronimus Academy of Data Science.

artificial intelligence, deep learning, machine learning, (10 more...)

Country:

North America > United States > Illinois (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > Canada > Quebec > Montreal (0.04)
(7 more...)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
